[fips-2021-10-20-1MU] Make SSL_select_next_proto more robust to invalid calls. #1678
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
SSL_select_next_proto has some fairly complex preconditions:
In the context of how this function is meant to be used, these are reasonable preconditions. The caller should not serialize its own list wrong, and it makes no sense to try to negotiate a protocol when you don't support any protocols. In particular, it complicates NPN's weird "opportunistic" protocol.
However, the preconditions are unchecked and a bit subtle. Violating them will result in memory errors. Bad syntax on the protocol lists is mostly not a concern (you should encode your own list correctly and the library checks the peer's list), but the second rule is somewhat of a mess in practice:
Despite having the same precondition in reality, OpenSSL did not document this. Their documentation implies things which are impossible without this precondition, but they forgot to actually write down the precondition. There's an added complexity that OpenSSL never updated the parameter names to match the role reversal between ALPN and NPN.
There are thus a few cases where a buggy caller may pass an empty "supported" list.
An NPN client called SSL_select_next_proto despite not actually supporting any NPN protocols.
An NPN client called SSL_select_next_proto, flipped the parameters, and the server advertised no protocols.
An ALPN server called SSL_select_next_proto, passed its own list in as the second parameter, despite not actually supporting any ALPN protocols.
In these scenarios, the "opportunistic" protocol returned in the OPENSSL_NPN_NO_OVERLAP case will be out of bounds. If the caller discards it, this does not matter. If the caller returns it through the NPN or ALPN selection callback, they have a problem. ALPN servers are expected to discard it, though some may be buggy. NPN clients may implement either behavior.
Older versions of some callers have exhibited variations on the above mistakes, so empirically folks don't always get it right. OpenSSL's wrong documentation also does not help matters. Instead, have SSL_select_next_proto just check these preconditions. That is not a performance-sensitive function and these preconditions are easy to check. While I'm here, rewrite it with CBS so it is much more straightforwardly correct.
What to return when the preconditions fail is tricky, but we need to output some protocol, so we output the empty protocol. This, per the previous test and doc fixes, is actually fine in NPN, so one of the above buggy callers is not retroactively made OK. But it is not fine in ALPN, so we still need to document that callers need to avoid this state. To that end, revamp the documentation a bit.
Thanks to Joe Birr-Pixton for reporting this!
Fixed: 735
Change-Id: I4378a082385e7334e6abaa6705e6b15d6843f6c5 Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/69028
Reviewed-by: Bob Beck [email protected]
Commit-Queue: David Benjamin [email protected]
(cherry picked from commit c1d9ac02514a138129872a036e3f8a1074dcb8bd)